Abstract. The Hardy-Littlewood-Pólya majorization theorem is extended to
the framework of some spaces with a curved geometry (such as the global NPC
spaces and the Wasserstein spaces). We also discuss the connection between
our concept of majorization and the subject of Schur convexity.



In 1929, G. H. Hardy, J. E. Littlewood and G. Pólya [9], [10] proved an
important characterization of convex functions in terms of a partial ordering of
vectors $x = (x_1, \dots, x_n)$ in $\mathbb{R}^n$. In order to state it we need
some preparation. We denote by $x^{\downarrow}$ the vector with the same entries
as $x$ but rearranged in decreasing order,

    $$x_1^{\downarrow} \ge \cdots \ge x_n^{\downarrow}.$$

Then $x$ is weakly majorized by $y$ (abbreviated, $x \prec_{*} y$) if

(1)  $$\sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow} \quad \text{for } k = 1, \dots, n$$

and $x$ is majorized by $y$ (abbreviated, $x \prec y$) if in addition

(2)  $$\sum_{i=1}^{n} x_i^{\downarrow} = \sum_{i=1}^{n} y_i^{\downarrow}.$$

Intuitively, $x \prec y$ means that the components of $x$ are less spread out
than the components of $y$. As is shown in Theorem 1 below, the concept of
majorization admits an order-free characterization based on the notion of doubly
stochastic matrix. Recall that a matrix $A \in M_n(\mathbb{R})$ is doubly
stochastic if it has nonnegative entries and each row and each column sums to
unity.
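The partial-sum conditions (1) and (2) are easy to test mechanically. The following is a minimal sketch, not part of the original text; the function names and the tolerance `tol` are illustrative assumptions.

```python
def partial_sums_desc(v):
    """Running sums of v after rearranging its entries in decreasing order."""
    total, sums = 0.0, []
    for entry in sorted(v, reverse=True):
        total += entry
        sums.append(total)
    return sums


def weakly_majorized(x, y, tol=1e-12):
    """Condition (1): each partial sum of x is at most the matching one of y."""
    if len(x) != len(y):
        return False
    return all(sx <= sy + tol
               for sx, sy in zip(partial_sums_desc(x), partial_sums_desc(y)))


def majorized(x, y, tol=1e-12):
    """Conditions (1) and (2): weak majorization plus equal total sums."""
    return weakly_majorized(x, y, tol) and abs(sum(x) - sum(y)) <= tol


print(majorized([2, 2, 2], [1, 2, 3]))         # True: (2,2,2) is less spread out
print(majorized([1, 2, 3], [2, 2, 2]))         # False
print(weakly_majorized([1, 1, 1], [1, 2, 3]))  # True, but the totals differ
```

The last call illustrates the gap between the two notions: (1) holds while (2) fails.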

Theorem 1. (Hardy, Littlewood and Pólya [9], Theorem 8). Let $x$ and $y$ be two
vectors in $\mathbb{R}^n$ whose entries belong to an interval $I$. Then the
following statements are equivalent:

i) $x \prec y$;

ii) There is a doubly stochastic matrix $A = (a_{ij})_{1 \le i,j \le n}$ such that $x = Ay$;

iii) The inequality $\sum_{i=1}^{n} f(x_i) \le \sum_{i=1}^{n} f(y_i)$ holds for
every continuous convex function $f : I \to \mathbb{R}$.

The proof of this result is also available in the recent monographs [15] and [18].

Remark 1. M. Tomić [25] and H. Weyl [26] have noticed the following
characterization of weak majorization: $x \prec_{*} y$ if and only if
$\sum_{i=1}^{n} f(x_i) \le \sum_{i=1}^{n} f(y_i)$ for every continuous
nondecreasing convex function $f$ defined on an interval containing the
components of $x$ and $y$. The reader will find the details in [15],
Proposition B2, p. 157.
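The implication ii) implies iii) of Theorem 1 can be illustrated numerically. The sketch below is not taken from the paper: the matrix `A`, the vector `y` and the three convex test functions are hypothetical data chosen only for illustration.

```python
import math


def mat_vec(A, y):
    """Product of the matrix A (given as a list of rows) with the vector y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]


def is_doubly_stochastic(A, tol=1e-12):
    """Nonnegative entries, every row and every column summing to 1."""
    n = len(A)
    nonneg = all(a >= 0 for row in A for a in row)
    rows = all(abs(sum(row) - 1.0) <= tol for row in A)
    cols = all(abs(sum(A[i][j] for i in range(n)) - 1.0) <= tol
               for j in range(n))
    return nonneg and rows and cols


# Hypothetical data (assumption): any doubly stochastic A would do.
A = [[0.50, 0.50, 0.0],
     [0.25, 0.25, 0.5],
     [0.25, 0.25, 0.5]]
y = [1.0, 2.0, 6.0]
x = mat_vec(A, y)  # statement ii): x = Ay

convex_tests = {
    "t^2": lambda t: t * t,
    "exp(t)": math.exp,
    "|t - 3|": lambda t: abs(t - 3.0),
}
for name, f in convex_tests.items():
    # statement iii): sum f(x_i) <= sum f(y_i)
    print(name, sum(map(f, x)) <= sum(map(f, y)))
```

A finite family of test functions can only refute iii), never prove it; the point of the theorem is that ii) guarantees the inequality for every continuous convex $f$.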

Nowadays many important applications of majorization are known in matrix
theory, numerical analysis, probability, combinatorics, quantum mechanics, etc.
See [3], [15], [18], [21], and [22]. They were made possible by the constant
growth of the theory, which is able to cover the most diverse situations.

In what follows we will be interested in a simple but basic extension of the
concept of majorization mentioned above: the weighted majorization. Indeed, the
entire subject of majorization can be switched from vectors to Borel probability
measures by identifying a vector $x = (x_1, \dots, x_n)$ in $\mathbb{R}^n$ with
the discrete measure $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i}$ acting on
$\mathbb{R}$. By definition,

    $$\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$$

if the conditions (1) and (2) above are fulfilled, and Theorem 1 can be equally seen
as a characterization of this instance of majorization.

Choquet's theory made available a very general framework of majorization by
allowing the comparison of Borel probability measures whose supports are contained
in a compact convex subset of a locally convex separated space. The highlights
of this theory are presented in [22] and refer to a concept of majorization based
on condition iii) in Theorem 1 above. Of interest to us is the particular case of
discrete probability measures on the Euclidean space $\mathbb{R}^N$, which admits
an alternative approach via condition ii) in the same Theorem 1. Indeed, in this
case one can introduce a relation of the form

(3)  $$\sum_{i=1}^{m} \lambda_i \delta_{x_i} \prec \sum_{j=1}^{n} \mu_j \delta_{y_j}$$

by asking the existence of an $m \times n$-dimensional matrix $A = (a_{ij})_{i,j}$ such that

(4)  $$a_{ij} \ge 0, \quad \text{for all } i, j$$

(5)  $$\sum_{j=1}^{n} a_{ij} = 1, \quad i = 1, \dots, m$$

(6)  $$\mu_j = \sum_{i=1}^{m} a_{ij} \lambda_i, \quad j = 1, \dots, n$$

and

(7)  $$x_i = \sum_{j=1}^{n} a_{ij} y_j, \quad i = 1, \dots, m.$$

The matrices verifying the conditions (4) and (5) are called stochastic on rows.
When $m = n$ and all weights $\lambda_i$ and $\mu_j$ are equal, the condition (6)
assures the stochasticity on columns, so in that case we deal with doubly
stochastic matrices.

The fact that (3) implies

    $$\sum_{i=1}^{m} \lambda_i f(x_i) \le \sum_{j=1}^{n} \mu_j f(y_j)$$

for every continuous convex function $f$ defined on a convex set containing all
points $x_i$ and $y_j$, is covered by a general result due to S. Sherman [23].
See also the paper of J. Borcea [5] for a nice proof and important applications.
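To make conditions (4) to (7) and the resulting inequality concrete, here is a small sketch in the plane. Everything in it (the weights `lam`, the matrix `A`, the points `ys`, and the test function `f`) is hypothetical illustration data, not taken from the paper.

```python
# Hypothetical data (assumption): m = 3 weighted points x_i, n = 4 points y_j.
lam = [0.2, 0.3, 0.5]                       # weights lambda_i, summing to 1
A = [[0.5, 0.50, 0.00, 0.0],                # conditions (4), (5): nonnegative
     [0.0, 0.25, 0.25, 0.5],                # entries, each row summing to 1
     [0.1, 0.20, 0.30, 0.4]]
ys = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

m, n = len(A), len(A[0])

# Condition (6): mu_j = sum_i a_ij * lambda_i
mu = [sum(A[i][j] * lam[i] for i in range(m)) for j in range(n)]

# Condition (7): x_i = sum_j a_ij * y_j
xs = [tuple(sum(A[i][j] * ys[j][k] for j in range(n)) for k in range(2))
      for i in range(m)]


def f(p):
    """A convex test function on the plane: the squared Euclidean norm."""
    return p[0] ** 2 + p[1] ** 2


lhs = sum(l * f(x) for l, x in zip(lam, xs))
rhs = sum(w * f(y) for w, y in zip(mu, ys))
print(abs(sum(mu) - 1.0) < 1e-12)  # the mu_j are again probability weights
print(lhs <= rhs)                  # the inequality implied by relation (3)
```

Note that the weights $\mu_j$ are not chosen freely: condition (6) determines them from $A$ and the $\lambda_i$.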

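It is worth noticing that the extended definition of majorization given by (3) is
related, via equality (7), to an optimization problem as follows:

    $$x_i = \arg\min_{z \in \mathbb{R}^N} \frac{1}{2} \sum_{j=1}^{n} a_{ij} \|z - y_j\|^2, \quad \text{for } i = 1, \dots, m.$$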

The aim of the present paper is to discuss the analogue of the relation of
majorization (3) within certain classes of spaces with curved geometry. We will
start with the spaces with global nonpositive curvature (abbreviated, global NPC
spaces). The subject of majorization in these spaces was touched upon in [17] via
a different concept of majorization. Central to us here is the generalization of
Theorem 1.

Definition 1. A global NPC space is a complete metric space $M = (M, d)$ for
which the following inequality holds true: for each pair of points
$x_0, x_1 \in M$ there exists a point $y \in M$ such that for all points $z \in M$,

(8)  $$d^2(z, y) \le \frac{1}{2} d^2(z, x_0) + \frac{1}{2} d^2(z, x_1) - \frac{1}{4} d^2(x_0, x_1).$$

These spaces are also known as the CAT(0) spaces. See [6]. In a global NPC
space, each pair of points $x_0, x_1 \in M$ can be connected by a geodesic (that
is, by a rectifiable curve $\gamma : [0, 1] \to M$ such that the length of
$\gamma|_{[s,t]}$ is $d(\gamma(s), \gamma(t))$ for all $0 \le s \le t \le 1$).
Moreover, this geodesic is unique.

In a global NPC space, the geodesics play the role of segments. The point $y$ that
appears in Definition 1 is the midpoint of $x_0$ and $x_1$ and has the property

    $$d(x_0, y) = d(y, x_1) = \frac{1}{2} d(x_0, x_1).$$

Every Hilbert space is a global NPC space. Its geodesics are the line segments.
The upper half-plane $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$,
endowed with the Poincaré metric,

    $$ds^2 = \frac{dx^2 + dy^2}{y^2},$$

constitutes another example of a global NPC space. In this case the geodesics are
the semicircles in $\mathbb{H}$ perpendicular to the real axis and the straight
vertical lines ending on the real axis.

A Riemannian manifold $(M, g)$ is a global NPC space if and only if it is
complete, simply connected and of nonpositive sectional curvature. Besides
manifolds, other important examples of global NPC spaces are the Bruhat-Tits
buildings (in particular, the trees). See [6]. More information on global NPC
spaces is available in [2], [12], and [24]. See also our papers [17] and [20].

Definition 2. A set $C \subset M$ is called convex if $\gamma([0, 1]) \subset C$
for each geodesic $\gamma : [0, 1] \to M$ joining $\gamma(0), \gamma(1) \in C$.

A function $\varphi : C \to \mathbb{R}$ is called convex if $C$ is a convex set
and for each geodesic $\gamma : [0, 1] \to C$ the composition
$\varphi \circ \gamma$ is a convex function in the usual sense, that is,

    $$\varphi(\gamma(t)) \le (1 - t)\varphi(\gamma(0)) + t\varphi(\gamma(1))$$

for all $t \in [0, 1]$.

The function $\varphi$ is called concave if $-\varphi$ is convex.

The distance function on a global NPC space $M = (M, d)$ verifies not only the
inequality (8), but also the following stronger version of it,

    $$d^2(z, x_t) \le (1 - t) d^2(z, x_0) + t d^2(z, x_1) - t(1 - t) d^2(x_0, x_1);$$

here $z$ is any point in $M$ and $x_t = \gamma(t)$, $t \in [0, 1]$, is any point
on the geodesic $\gamma$ joining $x_0, x_1 \in M$. In terms of Definition 2, this
shows that all the functions $d^2(\cdot, z)$ are uniformly convex. In particular,
they are convex and the balls are convex sets.
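The strengthened inequality can be probed numerically in the Poincaré half-plane. The sketch below rests on two standard facts that are assumptions relative to the text: the closed-form distance $\cosh d(z, w) = 1 + |z - w|^2 / (2 \operatorname{Im} z \operatorname{Im} w)$, and the parametrization $t \mapsto i a^{1-t} b^{t}$ of the vertical geodesic from $ia$ to $ib$. The sample points are arbitrary.

```python
import math


def hyp_dist(z, w):
    """Poincare distance between two points of the upper half-plane."""
    return math.acosh(1.0 + abs(z - w) ** 2 / (2.0 * z.imag * w.imag))


def geodesic_point(a, b, t):
    """Point at parameter t on the vertical geodesic from i*a to i*b."""
    return complex(0.0, a ** (1.0 - t) * b ** t)


def npc_gap(z, a, b, t):
    """Right-hand side minus left-hand side of the strengthened inequality."""
    x0, x1 = complex(0.0, a), complex(0.0, b)
    xt = geodesic_point(a, b, t)
    rhs = ((1 - t) * hyp_dist(z, x0) ** 2 + t * hyp_dist(z, x1) ** 2
           - t * (1 - t) * hyp_dist(x0, x1) ** 2)
    return rhs - hyp_dist(z, xt) ** 2


# Arbitrary sample of points z and parameters t (illustration only).
gaps = [npc_gap(complex(u, v), 0.5, 4.0, t)
        for u in (-3.0, -1.0, 0.0, 2.0)
        for v in (0.2, 1.0, 5.0)
        for t in (0.25, 0.5, 0.75)]
print(min(gaps) >= -1e-9)  # the gap should be nonnegative up to rounding
```

For $t = 1/2$ the same computation checks inequality (8) with $y = i\sqrt{ab}$ as the midpoint.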

In a global NPC space $M = (M, d)$ the distance function $d$ is convex on
$M \times M$ and also convex are the functions $d(\cdot, z)$. See [24],
Corollary 2.5, for details.

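Recall that the direct product of metric spaces $M_i = (M_i, d_i)$
($i = 1, \dots, n$) is the metric space $M = (M, d_M)$ defined by
$M = \prod_{i=1}^{n} M_i$ and

    $$d_M(x, y) = \Big( \sum_{i=1}^{n} d_i(x_i, y_i)^2 \Big)^{1/2}.$$

It is a global NPC space if all factors are global NPC spaces.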

When $x_1, \dots, x_m, y_1, \dots, y_n$ are points of a global NPC space $M$, and
$\lambda_1, \dots, \lambda_m \in [0, 1]$ are weights that sum to 1, we will define
the relation of majorization

(9)  $$\sum_{i=1}^{m} \lambda_i \delta_{x_i} \prec \sum_{j=1}^{n} \mu_j \delta_{y_j}$$

by asking the existence of an $m \times n$-dimensional matrix $A = (a_{ij})_{i,j}$
that is stochastic on rows and verifies the following two conditions:

(10)  $$\mu_j = \sum_{i=1}^{m} a_{ij} \lambda_i, \quad j = 1, \dots, n$$

and

(11)  $$x_i = \arg\min_{z \in M} \frac{1}{2} \sum_{j=1}^{n} a_{ij} d^2(z, y_j), \quad i = 1, \dots, m.$$

The existence and uniqueness of the solution of each optimization problem (11) is
assured by the fact that the objective functions are uniformly convex and
positive. See [12], Section 3.1, or [24], Proposition 1.7, p. 3.

Notice that the above definition agrees with the usual one in the Euclidean case.
It is also related to the definition of the barycenter of a Borel probability
measure $\mu$ defined on a global NPC space $M$. Precisely, if
$\mu \in \mathcal{P}_2(M)$ (the set of those probability measures under which all
functions $d^2(\cdot, z)$ are integrable), then its barycenter is defined by the
formula

    $$\operatorname{bar}(\mu) = \arg\min_{z \in M} \frac{1}{2} \int_M d^2(z, x)\, d\mu(x).$$

This definition, due to E. Cartan [7], was inspired by Gauss' Least Squares Method.
A broader approach to the notion of barycenter is offered by the recent paper of
Sturm [24].
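In the Euclidean case the minimization defining the barycenter has an explicit solution, the weighted mean, which is why the definition above agrees with the usual one there. The following sketch checks this in the plane through the identity $J(z) = J(c) + \frac{1}{2}\|z - c\|^2$, where $J$ is the objective function and $c$ the weighted mean; the weights and points are hypothetical.

```python
def objective(z, weights, points):
    """(1/2) * sum_j a_j * |z - y_j|^2 for points of the Euclidean plane."""
    return 0.5 * sum(a * ((z[0] - y[0]) ** 2 + (z[1] - y[1]) ** 2)
                     for a, y in zip(weights, points))


def weighted_mean(weights, points):
    """Coordinate-wise weighted mean; the weights are assumed to sum to 1."""
    return tuple(sum(a * y[k] for a, y in zip(weights, points))
                 for k in range(2))


# Hypothetical data (assumption), for illustration only.
weights = [0.2, 0.3, 0.5]
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0)]
center = weighted_mean(weights, points)


def excess(z):
    """J(z) - J(center); equals |z - center|^2 / 2 in the Euclidean case."""
    return objective(z, weights, points) - objective(center, weights, points)


for z in [(0.0, 0.0), (1.0, 1.0), (3.0, -2.0)]:
    half_sq = 0.5 * ((z[0] - center[0]) ** 2 + (z[1] - center[1]) ** 2)
    print(abs(excess(z) - half_sq) < 1e-12)
```

Since the excess is a positive multiple of the squared distance to the weighted mean, that point is the unique minimizer.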

The particular case of discrete probability measures
$\lambda = \sum_{i=1}^{n} \lambda_i \delta_{x_i}$ is of special interest because
the barycenter of $\lambda$ can be seen as a good analogue for the convex
combination (or weighted mean) $\lambda_1 x_1 + \cdots + \lambda_n x_n$. Indeed,

    $$\operatorname{bar}(\lambda) = \arg\min_{z \in M} \frac{1}{2} \sum_{i=1}^{n} \lambda_i d^2(z, x_i),$$

and the way $\operatorname{bar}(\lambda)$ provides a mean with nice features was
recently clarified by Lawson and Lim [14]. As an immediate consequence one
obtains the relation

    $$\delta_{\operatorname{bar}(\lambda)} \prec \lambda.$$



A word of caution is in order when denoting $\operatorname{bar}(\lambda)$ as
$\lambda_1 x_1 + \cdots + \lambda_n x_n$. Probably a notation like
$\lambda_1 x_1 \boxplus \cdots \boxplus \lambda_n x_n$ suits better because
$\operatorname{bar}(\lambda)$ can be far from the usual arithmetic mean. Consider
for example the case where $M$ is the space $\operatorname{Sym}^{++}(n, \mathbb{R})$
(of all positive definite matrices with real coefficients), endowed with the
trace metric,

    $$d_{\mathrm{trace}}(A, B) = \Big( \sum_{k=1}^{n} \log^2 \lambda_k \Big)^{1/2},$$

where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $AB^{-1}$. In this case

    $$\frac{1}{2} A \boxplus \frac{1}{2} B = A^{1/2} \big( A^{-1/2} B A^{-1/2} \big)^{1/2} A^{1/2},$$

that is, it coincides with the geometric mean of $A$ and $B$. See [3], Section 6.3, or
[13], for details.
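For $2 \times 2$ matrices the geometric mean can be computed without matrix square roots. The sketch below relies on a closed form that is not in the text and should be read as an assumption: with $\alpha = \sqrt{\det A}$ and $\beta = \sqrt{\det B}$, the mean is $\sqrt{\alpha\beta}\, S / \sqrt{\det S}$, where $S = A/\alpha + B/\beta$. The code checks the result against the Riccati equation $G A^{-1} G = B$, which the matrix $A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}$ satisfies. The matrices `A` and `B` are hypothetical.

```python
import math


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def mul2(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]


def geometric_mean_2x2(A, B):
    """Geometric mean of two positive definite 2x2 matrices (closed form)."""
    alpha, beta = math.sqrt(det2(A)), math.sqrt(det2(B))
    S = [[A[i][j] / alpha + B[i][j] / beta for j in range(2)]
         for i in range(2)]
    c = math.sqrt(alpha * beta) / math.sqrt(det2(S))
    return [[c * S[i][j] for j in range(2)] for i in range(2)]


# Hypothetical positive definite matrices (assumption).
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 4.0]]
G = geometric_mean_2x2(A, B)

# Residual of the Riccati equation G A^{-1} G = B.
R = mul2(mul2(G, inv2(A)), G)
residual = max(abs(R[i][j] - B[i][j]) for i in range(2) for j in range(2))
arithmetic = [[(A[i][j] + B[i][j]) / 2 for j in range(2)] for i in range(2)]
print(residual < 1e-12)
print(G != arithmetic)  # the barycenter is not the arithmetic mean
```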

Since the convex combinations within a global NPC space lack in general the
property of associativity,

    $$\sum_{i=1}^{n+1} \lambda_i x_i = (1 - \lambda_{n+1}) \Big( \sum_{i=1}^{n} \frac{\lambda_i}{1 - \lambda_{n+1}} x_i \Big) + \lambda_{n+1} x_{n+1},$$

the proof of Jensen's inequality is not trivial even in the discrete case. This
explains why this inequality was first stated in this context only in 2001 by
J. Jost [11]. We recall it here in the formulation of Eells and Fuglede [8],
Proposition 12.3, p. 242:

Theorem 2. (Jensen's Inequality). For any lower semicontinuous convex function
$f : M \to \mathbb{R}$ and any Borel probability measure
$\mu \in \mathcal{P}_2(M)$ we have the inequality

    $$f(\operatorname{bar}(\mu)) \le \int_M f(x)\, d\mu(x),$$

provided the right hand side is well-defined.
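A one-dimensional instance shows what Theorem 2 says outside the flat case. The sketch below assumes the space $(0, \infty)$ with the distance $d(a, b) = |\log a - \log b|$, which is $\operatorname{Sym}^{++}(1, \mathbb{R})$ with the trace metric and is isometric to the real line. There the barycenter is the weighted geometric mean, the function $f(t) = t$ is convex along geodesics $s \mapsto a^{1-s} b^{s}$, and Jensen's inequality becomes the weighted AM-GM inequality. Weights and points are hypothetical.

```python
import math


def log_dist(a, b):
    """Distance on (0, inf) that turns t -> log t into an isometry onto R."""
    return abs(math.log(a) - math.log(b))


def barycenter(weights, points):
    """Minimizer of the objective below: the weighted geometric mean."""
    return math.exp(sum(w * math.log(x) for w, x in zip(weights, points)))


def objective(z, weights, points):
    """(1/2) * sum_i w_i * d(z, x_i)^2 for the logarithmic distance."""
    return 0.5 * sum(w * log_dist(z, x) ** 2 for w, x in zip(weights, points))


# Hypothetical data (assumption).
weights = [0.2, 0.3, 0.5]
points = [1.0, 4.0, 9.0]

b = barycenter(weights, points)
jensen_rhs = sum(w * x for w, x in zip(weights, points))  # integral of f(t) = t

print(b <= jensen_rhs)  # f(bar(mu)) <= integral of f, i.e. AM-GM
print(all(objective(b, weights, points) <= objective(b * s, weights, points)
          for s in (0.5, 0.9, 1.1, 2.0)))
```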

The proof of Eells and Fuglede is based on the following remark concerning
barycenters: If a probability measure $\mu$ is supported by a convex closed set
$K$, then its barycenter $\operatorname{bar}(\mu)$ lies in $K$. A probabilistic
approach to Theorem 2 is due to Sturm [24].

An immediate consequence of Theorem 2 is the following couple of inequalities
that work for any points $z, x_1, \dots, x_n, y_1, \dots, y_n$ in a global NPC space:

    $$d^2\Big( \frac{1}{n} x_1 \boxplus \cdots \boxplus \frac{1}{n} x_n,\ z \Big) \le \frac{d^2(x_1, z) + \cdots + d^2(x_n, z)}{n}$$

and

    $$d\Big( \frac{1}{n} x_1 \boxplus \cdots \boxplus \frac{1}{n} x_n,\ \frac{1}{n} y_1 \boxplus \cdots \boxplus \frac{1}{n} y_n \Big) \le \frac{d(x_1, y_1) + \cdots + d(x_n, y_n)}{n}.$$

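The next theorem offers a partial extension of the Hardy-Littlewood-Pólya Theorem
to the context of global NPC spaces.

Theorem 3. If

    $$\sum_{i=1}^{m} \lambda_i \delta_{x_i} \prec \sum_{j=1}^{n} \mu_j \delta_{y_j}$$

in the global NPC space $M$, then, for every continuous convex function $f$
defined on a convex subset $U \subset M$ containing all points $x_i$ and $y_j$ we
have

    $$\sum_{i=1}^{m} \lambda_i f(x_i) \le \sum_{j=1}^{n} \mu_j f(y_j).$$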

Proof. By our hypothesis, there is an $m \times n$-dimensional matrix
$A = (a_{ij})_{i,j}$ that is stochastic on rows and verifies the conditions (10)
and (11). The last condition shows that each point $x_i$ is the barycenter of the
probability measure $\sum_{j=1}^{n} a_{ij} \delta_{y_j}$, so by Jensen's
inequality we infer that

    $$f(x_i) \le \sum_{j=1}^{n} a_{ij} f(y_j).$$

Multiplying each side by $\lambda_i$ and then summing up over $i$ from 1 to $m$,
we conclude that

    $$\sum_{i=1}^{m} \lambda_i f(x_i) \le \sum_{i=1}^{m} \lambda_i \sum_{j=1}^{n} a_{ij} f(y_j) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{m} a_{ij} \lambda_i \Big) f(y_j) = \sum_{j=1}^{n} \mu_j f(y_j).$$

□

In a global NPC space the distance function from a convex set is a convex
function. See [24], Corollary 2.5. Combining this fact with Theorem 3 we infer the
following result.

Corollary 1. If

    $$\sum_{i=1}^{m} \lambda_i \delta_{x_i} \prec \sum_{j=1}^{n} \mu_j \delta_{y_j}$$

and all coefficients $\lambda_i$ are positive, then $\{x_1, \dots, x_m\}$ is
contained in the convex hull of $\{y_1, \dots, y_n\}$.

In particular, the points $x_i$ spread out less than the points $y_j$.

Another application of Theorem 3 yields a new set of inequalities verified by the
functions $d(\cdot, z)$ in a global NPC space $M$. These functions are convex and
the same is true for the functions $f(d(\cdot, z))$ whenever $f$ is a continuous
nondecreasing convex function defined on $\mathbb{R}_+$. According to Theorem 3,
if $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$
in $M$, then $\sum_{i=1}^{n} f(d(x_i, z)) \le \sum_{i=1}^{n} f(d(y_i, z))$.
Taking into account Remark 1 we arrive at the following result:

Corollary 2. If $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$
in $M = (M, d)$, then for all $z \in M$,

    $$(d(x_1, z), \dots, d(x_n, z)) \prec_{*} (d(y_1, z), \dots, d(y_n, z)).$$
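Corollary 2 can be checked numerically in the simplest global NPC space, the Euclidean plane, where barycenters are weighted means. In the sketch below the doubly stochastic matrix `A`, the points `ys` and the grid of reference points `z` are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def weakly_majorized(x, y, tol=1e-12):
    """Partial sums of the decreasing rearrangements: those of x stay below."""
    sx = sy = 0.0
    for a, b in zip(sorted(x, reverse=True), sorted(y, reverse=True)):
        sx, sy = sx + a, sy + b
        if sx > sy + tol:
            return False
    return True


# Hypothetical doubly stochastic matrix and points of the plane (assumption).
A = [[0.6, 0.3, 0.1],
     [0.3, 0.3, 0.4],
     [0.1, 0.4, 0.5]]
ys = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]

# In the Euclidean case each x_i is the weighted mean of the y_j with row i of A.
xs = [tuple(sum(a * y[k] for a, y in zip(row, ys)) for k in range(2))
      for row in A]

grid = [(u, v) for u in (-2.0, 0.0, 1.0, 5.0) for v in (-1.0, 0.5, 3.0)]
holds = all(
    weakly_majorized([math.dist(x, z) for x in xs],
                     [math.dist(y, z) for y in ys])
    for z in grid
)
print(holds)
```

Only weak majorization can be expected here: the sums of the two distance vectors generally differ.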

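According to a result due to Ando (see [15], Theorem B.3a, p. 158), the converse
of Corollary 2 works when $M = \mathbb{R}$.
The entropy function,

    $$H(t) = -t \log t,$$

is concave and decreasing for $t \in [1/e, \infty)$, so by Corollary 2 we infer that

    $$\sum_{i=1}^{n} H(d(x_i, z)) \ge \sum_{i=1}^{n} H(d(y_i, z))$$

whenever $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$
and all the points $x_i$ and $y_i$ are at a distance $\ge 1/e$ from $z$.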

Many other inequalities involving distances in a global NPC space can be derived
from Corollary 2 and the following result due to Fan and Mirsky: if
$x, y \in \mathbb{R}^n_+$, then $x \prec_{*} y$ if and only if

    $$\Phi(x) \le \Phi(y)$$

for all functions $\Phi : \mathbb{R}^n \to \mathbb{R}$ such that:

(1) $\Phi(x) > 0$ when $x \ne 0$;

(2) $\Phi(\alpha x) = |\alpha| \Phi(x)$ for all real $\alpha$;

(3) $\Phi(x + y) \le \Phi(x) + \Phi(y)$;

(4) $\Phi(x_1, \dots, x_n) = \Phi(\varepsilon_1 x_{\pi(1)}, \dots, \varepsilon_n x_{\pi(n)})$
whenever each $\varepsilon_i$ belongs to $\{-1, 1\}$ and $\pi$ is any permutation
of $\{1, \dots, n\}$.

For details, see [15], Proposition B6, p. 160.

It is worth noticing the connection between our definition of majorization and
the subject of Schur convexity (as presented in [18]):

Theorem 4. Suppose that $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$
in the global NPC space $M = (M, d)$, and $f : M^n \to \mathbb{R}$ is a
continuous convex function invariant under the permutation of coordinates. Then

    $$f(x_1, \dots, x_n) \le f(y_1, \dots, y_n).$$

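Proof. For the sake of simplicity we will restrict here to the case where $n = 3$.

According to the definition of majorization, if
$\frac{1}{3}\sum_{i=1}^{3} \delta_{x_i} \prec \frac{1}{3}\sum_{i=1}^{3} \delta_{y_i}$,
then there exists a doubly stochastic matrix $A = (a_{ij})_{i,j=1}^{3}$ such that

    $$x_i = \operatorname{bar}\Big( \sum_{j=1}^{3} a_{ij} \delta_{y_j} \Big) \quad \text{for } i = 1, 2, 3.$$

As $A$ can be represented in the form

    $$A = \begin{pmatrix} \lambda_1 + \lambda_2 & \lambda_3 + \lambda_4 & \lambda_5 + \lambda_6 \\ \lambda_3 + \lambda_5 & \lambda_1 + \lambda_6 & \lambda_2 + \lambda_4 \\ \lambda_4 + \lambda_6 & \lambda_2 + \lambda_5 & \lambda_1 + \lambda_3 \end{pmatrix},$$

where all $\lambda_k$ are nonnegative and $\sum_{k=1}^{6} \lambda_k = 1$ (a simple
matter of linear algebra), we can represent the elements $x_i$ as

    $$x_1 = \operatorname{bar}\big( (\lambda_1 + \lambda_2)\delta_{y_1} + (\lambda_3 + \lambda_4)\delta_{y_2} + (\lambda_5 + \lambda_6)\delta_{y_3} \big),$$
    $$x_2 = \operatorname{bar}\big( (\lambda_3 + \lambda_5)\delta_{y_1} + (\lambda_1 + \lambda_6)\delta_{y_2} + (\lambda_2 + \lambda_4)\delta_{y_3} \big),$$
    $$x_3 = \operatorname{bar}\big( (\lambda_4 + \lambda_6)\delta_{y_1} + (\lambda_2 + \lambda_5)\delta_{y_2} + (\lambda_1 + \lambda_3)\delta_{y_3} \big).$$

It is easy to see that $(x_1, x_2, x_3)$ is the barycenter of

    $$\mu = \lambda_1 \delta_{(y_1, y_2, y_3)} + \lambda_2 \delta_{(y_1, y_3, y_2)} + \lambda_3 \delta_{(y_2, y_1, y_3)} + \lambda_4 \delta_{(y_2, y_3, y_1)} + \lambda_5 \delta_{(y_3, y_1, y_2)} + \lambda_6 \delta_{(y_3, y_2, y_1)},$$

so by Jensen's inequality and the symmetry of $f$ we get

    $$f(x_1, x_2, x_3) \le \lambda_1 f(y_1, y_2, y_3) + \lambda_2 f(y_1, y_3, y_2) + \lambda_3 f(y_2, y_1, y_3) + \lambda_4 f(y_2, y_3, y_1) + \lambda_5 f(y_3, y_1, y_2) + \lambda_6 f(y_3, y_2, y_1)$$
    $$= (\lambda_1 + \cdots + \lambda_6) f(y_1, y_2, y_3) = f(y_1, y_2, y_3).$$

□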
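The bookkeeping in this proof can be replayed on a small example with $M = \mathbb{R}$. The sketch below builds the $3 \times 3$ matrix from six weights attached to the six permutations, in the order used for the measure $\mu$, and then checks that it is doubly stochastic and that a symmetric convex function does not increase. The weights `lam`, the vector `y` and the function `spread` are hypothetical illustration data.

```python
# The six permutations of (0, 1, 2), ordered as the atoms of mu in the proof:
# the atom attached to p is (y[p[0]], y[p[1]], y[p[2]]).
PERMS = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]


def matrix_from_weights(lam):
    """a[i][j] = total weight of the permutations p with p[i] == j."""
    A = [[0.0] * 3 for _ in range(3)]
    for weight, p in zip(lam, PERMS):
        for i in range(3):
            A[i][p[i]] += weight
    return A


def spread(v):
    """A symmetric convex function of three real variables: max minus min."""
    return max(v) - min(v)


# Hypothetical weights (nonnegative, summing to 1) and points of the real line.
lam = [0.1, 0.2, 0.3, 0.1, 0.2, 0.1]
y = [1.0, 5.0, 9.0]

A = matrix_from_weights(lam)
x = [sum(a * yj for a, yj in zip(row, y)) for row in A]  # barycenters in R

row_sums = [sum(row) for row in A]
col_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
print(all(abs(s - 1.0) < 1e-12 for s in row_sums + col_sums))
print(spread(x) <= spread(y))
```

With these weights the first row of `A` is $(\lambda_1 + \lambda_2, \lambda_3 + \lambda_4, \lambda_5 + \lambda_6)$, the weights appearing in the formula for $x_1$.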

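The following consequence of Theorem 4 relates the majorization of measures to
the dispersion of their supports.

Corollary 3. If $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$
in the global NPC space $M = (M, d)$, then

    $$\sum_{1 \le i < j \le n} d^{\alpha}(x_i, x_j) \le \sum_{1 \le i < j \le n} d^{\alpha}(y_i, y_j)$$

for every $\alpha \ge 1$.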
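As a quick sanity check of this dispersion inequality, the sketch below compares the sums of pairwise distances raised to the power $\alpha$ in the Euclidean plane, where the points $x_i$ are obtained from the $y_j$ through a doubly stochastic matrix. The matrix, the points and the exponents are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations


def dispersion(points, alpha):
    """Sum over all pairs i < j of the distance raised to the power alpha."""
    return sum(math.dist(p, q) ** alpha for p, q in combinations(points, 2))


# Hypothetical doubly stochastic matrix and points of the plane (assumption).
A = [[0.7, 0.2, 0.1, 0.0],
     [0.1, 0.6, 0.2, 0.1],
     [0.2, 0.1, 0.5, 0.2],
     [0.0, 0.1, 0.2, 0.7]]
ys = [(0.0, 0.0), (6.0, 0.0), (6.0, 4.0), (0.0, 4.0)]
xs = [tuple(sum(a * y[k] for a, y in zip(row, ys)) for k in range(2))
      for row in A]

for alpha in (1.0, 1.5, 2.0, 3.0):
    print(alpha, dispersion(xs, alpha) <= dispersion(ys, alpha))
```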

Alert readers have probably already noticed that essential for the theory of
majorization presented above is the occurrence of the following two facts:

(1) the existence of a unique minimizer for the functionals of the form

    $$J(x) = \sum_{i=1}^{m} \lambda_i d^2(x, x_i)$$

(thought of as the barycenter $\operatorname{bar}(\lambda)$ of the discrete
probability measure $\lambda = \sum_{i=1}^{m} \lambda_i \delta_{x_i}$);

(2) the Jensen type inequality,

    $$f(\operatorname{bar}(\lambda)) \le \int f\, d\lambda = \sum_{i=1}^{m} \lambda_i f(x_i),$$

for $f$ in our class of generalized convex functions.
The recent paper of Agueh and Carlier [1] shows that such a framework is
available also in the case of certain Borel probability measures, equipped with
the Wasserstein metric. More precisely they consider the space
$\mathcal{P}_2(\mathbb{R}^N)$ (of all Borel probability measures on
$\mathbb{R}^N$ having finite second moments) endowed with the Wasserstein metric,

    $$W_2(\mu, \nu) = \inf \Big( \int_{\mathbb{R}^N \times \mathbb{R}^N} \|x - y\|^2 \, d\gamma(x, y) \Big)^{1/2},$$

where the infimum is taken over all Borel probability measures $\gamma$ on
$\mathbb{R}^N \times \mathbb{R}^N$ with marginals $\mu$ and $\nu$.
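In dimension $N = 1$ the infimum can be computed by sorting. The sketch below assumes two measures with the same number of equally weighted atoms on the real line, in which case pairing the atoms in increasing order is optimal. It compares the sorted formula with a brute-force search over all pairings. The sample atoms are hypothetical.

```python
import math
from itertools import permutations


def w2_sorted(xs, ys):
    """W_2 between (1/n) sum delta_{x_i} and (1/n) sum delta_{y_i} on the line."""
    if len(xs) != len(ys):
        raise ValueError("both measures must have the same number of atoms")
    n = len(xs)
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(sorted(xs), sorted(ys))) / n)


def w2_brute_force(xs, ys):
    """Minimum over all one-to-one pairings of the atoms (small n only)."""
    n = len(xs)
    best = min(sum((a - b) ** 2 for a, b in zip(xs, perm))
               for perm in permutations(ys))
    return math.sqrt(best / n)


# Hypothetical atoms (assumption).
xs = [0.0, 1.0, 3.0, 7.0]
ys = [2.0, 2.0, 5.0, 4.0]
print(abs(w2_sorted(xs, ys) - w2_brute_force(xs, ys)) < 1e-12)
```

The brute-force search restricts to one-to-one pairings, which is enough here because the extreme points of the set of couplings of two such measures are permutation couplings.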

The barycenter $\operatorname{bar}\big(\sum_{i=1}^{m} \lambda_i \delta_{\nu_i}\big)$
of a discrete probability measure $\sum_{i=1}^{m} \lambda_i \delta_{\nu_i}$ is
defined as the minimizer of the functional

    $$J(\nu) = \frac{1}{2} \sum_{i=1}^{m} \lambda_i W_2^2(\nu_i, \nu).$$

This minimizer is unique when at least one of the measures $\nu_i$ vanishes on
every Borel set of Hausdorff dimension $N - 1$. See [1], Proposition 2.2 and
Proposition 3.5.
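On the real line this minimization can be made explicit. The sketch below assumes $N = 1$ and measures with the same number of equally weighted atoms. Under that assumption the barycenter is obtained by averaging the sorted atoms position by position, a standard one-dimensional fact that is not stated in the text. The code compares the value of the functional at this candidate with its value at a few competitors. All data are hypothetical.

```python
def w2_squared(xs, ys):
    """Squared W_2 distance between two equally weighted n-atom measures on R."""
    n = len(xs)
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / n


def barycenter_1d(weights, measures):
    """Average the k-th smallest atoms of all measures with the given weights."""
    sorted_atoms = [sorted(atoms) for atoms in measures]
    n = len(sorted_atoms[0])
    return [sum(w * atoms[k] for w, atoms in zip(weights, sorted_atoms))
            for k in range(n)]


def functional(candidate, weights, measures):
    """(1/2) * sum_i lambda_i * W_2^2(nu_i, candidate)."""
    return 0.5 * sum(w * w2_squared(atoms, candidate)
                     for w, atoms in zip(weights, measures))


# Hypothetical data (assumption): three measures with three atoms each.
weights = [0.5, 0.25, 0.25]
measures = [[0.0, 1.0, 5.0], [2.0, 2.0, 8.0], [-1.0, 4.0, 4.0]]

bar = barycenter_1d(weights, measures)
competitors = measures + [[b + 0.1 for b in bar], [b * 1.05 for b in bar]]
print(all(functional(bar, weights, measures) <= functional(c, weights, measures)
          for c in competitors))
```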

The natural class of convex functions on the Wasserstein space is that of
functions convex along barycenters. According to [1], Definition 7.1, a function
$\mathcal{F} : \mathcal{P}_2(\mathbb{R}^N) \to \mathbb{R}$ is said to be convex
along barycenters if for any discrete probability measure
$\sum_{i=1}^{m} \lambda_i \delta_{\nu_i}$ on $\mathcal{P}_2(\mathbb{R}^N)$ we have

    $$\mathcal{F}\Big( \operatorname{bar}\Big( \sum_{i=1}^{m} \lambda_i \delta_{\nu_i} \Big) \Big) \le \sum_{i=1}^{m} \lambda_i \mathcal{F}(\nu_i).$$

This notion of convexity coincides with the notion of displacement convexity
introduced by McCann [16] if $N = 1$, and is stronger than this in the general
case. However, the main examples of displacement convex functions (such as the
internal energy, the potential energy and the interaction energy) are also
examples of functions convex along barycenters. See [1], Proposition 7.7.

Theorem 5. The concept of majorization and all results noticed in the case of
global NPC spaces (in particular, Theorem 3 and Theorem 4) remain valid in the
context of discrete probability measures on $\mathcal{P}_2(\mathbb{R}^N)$ having
unique barycenters and of the functions
$\mathcal{F} : \mathcal{P}_2(\mathbb{R}^N) \to \mathbb{R}$ convex along barycenters.

We end our paper with an open problem that arises in connection to Rado's
geometric characterization of majorization in $\mathbb{R}^n$:
$(x_1, \dots, x_n) \prec (y_1, \dots, y_n)$ in $\mathbb{R}^n$ if and only if
$(x_1, \dots, x_n)$ lies in the convex hull of the $n!$ permutations of
$(y_1, \dots, y_n)$. See [15], Corollary B.3, p. 34. A relation of majorization of
this kind can be introduced in the power space $M^n$ (of any global NPC space
$M = (M, d)$ as well as of $\mathcal{P}_2(\mathbb{R}^N)$) by putting

    $$(x_1, \dots, x_n) \prec (y_1, \dots, y_n)$$

if $\frac{1}{n}\sum_{i=1}^{n} \delta_{x_i} \prec \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}$.
The proof of Theorem 4 yields immediately the necessity part of Rado's
characterization: if $(x_1, \dots, x_n) \prec (y_1, \dots, y_n)$ in $M^n$, then
$(x_1, \dots, x_n)$ lies in the convex hull of the $n!$ permutations of
$(y_1, \dots, y_n)$. Does the converse work? We know that the answer is positive
if $M$ is a Hilbert space but the general case remains open.